Add Leader Election for Housekeeping in Sharded Cluster using MongoDB #529
base: main
Conversation
Codecov Report
@@            Coverage Diff             @@
##             main     #529      +/-   ##
==========================================
+ Coverage   51.50%   51.70%   +0.20%
==========================================
  Files          67       68       +1
  Lines        6932     7048     +116
==========================================
+ Hits         3570     3644      +74
- Misses       2891     2930      +39
- Partials      471      474       +3
Thanks for your contribution.
I left a simple question. 🙏
Thank you for the explanation.
Could you please add the tests for the two situations I asked about?
- Lease renewal while handling a long task.
- Handling background routines when shutting down the server.
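For context on what these two scenarios involve, a minimal Go sketch is shown below. Everything in it is assumed for illustration: renewLease, keepLeaseAlive, and the chosen TTL are hypothetical names and values, not the PR's actual API. It shows a background routine that keeps extending the lease while a long task runs and that stops cleanly on shutdown via context cancellation.

```go
// Hypothetical sketch: a lease-renewal loop that runs alongside a long
// housekeeping task and stops cleanly when the server shuts down.
// None of these identifiers come from the PR; they only illustrate the
// two scenarios that the requested tests would need to cover.
package main

import (
	"context"
	"fmt"
	"time"
)

// renewLease stands in for a single lease-extension call (e.g. a database update).
func renewLease(ctx context.Context) error {
	fmt.Println("lease renewed")
	return nil
}

// keepLeaseAlive renews the lease at a fraction of its TTL until ctx is cancelled.
func keepLeaseAlive(ctx context.Context, ttl time.Duration) {
	ticker := time.NewTicker(ttl / 3)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done(): // server shutdown: stop the background routine
			fmt.Println("renewal loop stopped:", ctx.Err())
			return
		case <-ticker.C:
			if err := renewLease(ctx); err != nil {
				return // lost the lease; the caller should stop housekeeping
			}
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())

	// Scenario 1: the lease keeps being renewed while a long task is running.
	go keepLeaseAlive(ctx, 3*time.Second)
	time.Sleep(5 * time.Second) // simulate a housekeeping task longer than the TTL

	// Scenario 2: shutting down the server cancels the context and stops the routine.
	cancel()
	time.Sleep(100 * time.Millisecond)
}
```

A test for the first scenario would assert that the lease never expires while the long task runs; a test for the second would assert that the background routine exits promptly once the context is cancelled.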
The branch was force-pushed from b5a41c0 to dad28f9.
I personally think we can also use a K8s CronJob object to run housekeeping in cluster mode, without using leader election for housekeeping. Using a K8s CronJob would reduce the overhead of processing leader elections on the server cluster, which could have a big impact on a large cluster of servers.
The branch was force-pushed from dad28f9 to 23e9f1d.
I forgot to re-request review on this PR 😄 Requesting it now.
Thanks for applying my requests.
> I personally think we can also use a K8s CronJob object to run housekeeping in cluster mode, without using leader election for housekeeping. Using a K8s CronJob would reduce the overhead of processing leader elections on the server cluster, which could have a big impact on a large cluster of servers.
Using CronJob seems to be a good and straightforward approach. If we choose to use it, we can run the cluster without a leader election.
However, we should consider how to provide private APIs that are meant to be exclusively called by internal components and should not be exposed publicly. The proposed idea doesn't seem to depend on k8s, as we only need to expose the private APIs.
https://kubernetes.io/docs/concepts/workloads/controllers/cron-jobs/#concurrency-policy
How would you prefer to proceed?
@hackerwins At first glance, I thought we could introduce some sort of housekeeping-only mode on the server or CLI and run it periodically with a K8s CronJob (or maybe just run a housekeeping-only server pod), since housekeeping does not need to receive or provide any external context: it fetches all projects and clients from the database, and its results are applied only back to the database. I will search for more context and information to get more ideas.
The branch was force-pushed from 7892142 to fdc2e1c.
The branch was force-pushed from fbc6098 to 7d65a57.
What this PR does / why we need it:
Add leader election for housekeeping in a sharded cluster using MongoDB.
- A server/backend/election package is added for leader election, with a database implementation.
- The database implementation uses MongoDB and TTL indexes to ensure that only one node can acquire the housekeeping leader lease, preventing other nodes from updating the same document simultaneously when it expires.
- The yorkie-cluster Helm Chart is configured to enable --housekeeping-leader-election so that only the leader performs housekeeping in a sharded cluster.
- The hostname value in server/backend/backend.go is fixed. For more information, see the "Add user agent metrics" PR.

Which issue(s) this PR fixes:
Fixes #505
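As a rough illustration of the TTL-index lease mechanism described above, here is a minimal sketch. It is an assumption, not the PR's actual implementation: the collection layout, the leaseID key, and the function names (ensureTTLIndex, tryAcquire, renew) are hypothetical, using only standard mongo-driver calls.

```go
// Minimal sketch (assumed, not the PR's actual code) of a MongoDB TTL-index
// lease: every node tries to insert the same lease document; the unique _id
// lets only one insert succeed, and the TTL index removes the document once
// the leader stops renewing it, letting another node take over.
package election

import (
	"context"
	"time"

	"go.mongodb.org/mongo-driver/bson"
	"go.mongodb.org/mongo-driver/mongo"
	"go.mongodb.org/mongo-driver/mongo/options"
)

const leaseID = "housekeeping-leader" // hypothetical single well-known lease key

// ensureTTLIndex creates a TTL index so MongoDB deletes the lease document
// once its expires_at timestamp has passed.
func ensureTTLIndex(ctx context.Context, coll *mongo.Collection) error {
	_, err := coll.Indexes().CreateOne(ctx, mongo.IndexModel{
		Keys:    bson.D{{Key: "expires_at", Value: 1}},
		Options: options.Index().SetExpireAfterSeconds(0),
	})
	return err
}

// tryAcquire attempts to become the leader. Because _id is unique, only one
// node's insert succeeds while a lease document still exists.
func tryAcquire(ctx context.Context, coll *mongo.Collection, hostname string, ttl time.Duration) (bool, error) {
	_, err := coll.InsertOne(ctx, bson.M{
		"_id":        leaseID,
		"lessee":     hostname,
		"expires_at": time.Now().Add(ttl),
	})
	if mongo.IsDuplicateKeyError(err) {
		return false, nil // another node currently holds the lease
	}
	return err == nil, err
}

// renew pushes the expiry forward, but only if this node still owns the lease.
func renew(ctx context.Context, coll *mongo.Collection, hostname string, ttl time.Duration) error {
	_, err := coll.UpdateOne(ctx,
		bson.M{"_id": leaseID, "lessee": hostname},
		bson.M{"$set": bson.M{"expires_at": time.Now().Add(ttl)}},
	)
	return err
}
```

One property worth noting with this pattern: MongoDB's TTL monitor removes expired documents only periodically (roughly every 60 seconds by default), so failover after a crashed leader is not immediate. The leader renews well within the TTL, and followers keep retrying acquisition until the stale document is removed.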
Special notes for your reviewer:
I have tested this on a K8s environment with the yorkie-cluster Helm chart on minikube, and it worked as expected.

Does this PR introduce a user-facing change?:
Additional documentation:
Checklist: